A Class of Generalized Hyperbolic Continuous Time 
Integrated Stochastic Volatility Likelihood Models 

Lancelot F. James and John W. Lau 

The Hong Kong University of Science and Technology and University of Bristol 


This paper discusses and analyzes a class of likelihood models which are based on two distributional 
innovations in financial models for stock returns. The first is the notion that the marginal distribution of 
aggregate returns of log-stock prices is well approximated by generalized hyperbolic distributions. The second is 
that volatility clustering can be handled by specifying the integrated volatility as a random process such 
as that proposed in a recent series of papers by Barndorff-Nielsen and Shephard (BNS). Indeed, the 
use of just the integrated Ornstein-Uhlenbeck (INT-OU) models of BNS serves to handle both features 
mentioned above. The BNS models produce likelihoods for aggregate returns which can be viewed 
as a subclass of latent regression models where one has n conditionally independent Normal random 
variables whose mean and variance are representable as linear functionals of a common unobserved 
Poisson random measure. James (2005b) recently obtained an exact analysis for such models, yielding 
expressions of the likelihood in terms of quite tractable Fourier-Cosine integrals. Here, our idea is to 
analyze a class of likelihoods, which can be used for similar purposes, but where the latent regression 
models are based on n conditionally independent models with distributions belonging to a subclass 
of the generalized hyperbolic distributions and whose corresponding parameters are representable as 
linear functionals of a common unobserved Poisson random measure. Our models are perhaps most 
closely related to the Normal inverse Gaussian/GARCH/A-PARCH models of Barndorff-Nielsen (1997) 
and Jensen and Lunde (2001), where in our case the GARCH component is replaced by quantities 
such as INT-OU processes. It is seen that, importantly, such likelihood models exhibit quite different 
structural features. Rather than Fourier-Cosine integrals, the exact analysis of these models yields 
characterizations in terms of random partitions of the integers which can be easily handled by Bayesian 
SIS/MCMC procedures similar to those which have been applied to Dirichlet/Gamma process mixture 
models. Importantly, these methods do not necessarily require the simulation of random measures. 

One nice feature of the model is that it allows for more flexibility in terms of modelling of external 
regression parameters. Our models may also be viewed as alternatives to closely related latent class 
GARCH models arising in financial economics. 

1 Introduction 

In financial economics, it is well known that Gaussian based models such as the Black-Scholes-Samuelson 
model for the returns of log-stock prices do not fit with empirical observations when returns are 
observed over moderately sized intervals, for example on a daily basis. The Black-Scholes-Samuelson 
model may be described in terms of the following stochastic differential equation, 

(1) $dx^*(t) = (\mu + \beta\sigma^2)\,dt + \sigma\,dw(t)$ 

where $x^*(t)$ denotes the price level, $\sigma$ represents a constant volatility and $w(t)$ is Brownian 
Motion. When observed over $i = 1,\ldots,n$ equally spaced time intervals of length $\Delta > 0$, one 
has that the aggregate returns $x^*(i\Delta) - x^*((i-1)\Delta)$ are iid Normal random variables with 
mean and variance $(\mu\Delta + \sigma^2\beta, \sigma^2)$. In terms of statistical inference, this produces a classical 
Normal likelihood where estimation of the parameters $(\beta, \mu)$ is straightforward. However, it 
is known that, while the model (1) is plausible for large $\Delta$, when $\Delta$ is of moderate size 
the aggregate returns exhibit behavior more like that of semi-heavy tailed distributions. 
Moreover, the returns exhibit a feature known as volatility persistence or clustering. This 
suggests that $\sigma$ should be replaced by a dynamic random process which has correlated 
increments. See for instance Carr and Wu (2004), Carr, Geman, Madan and Yor (2003), 
Barndorff-Nielsen and Shephard (2001a,b), Duan (1995), and Engle (1982) for these points 
and various proposals to enhance (1). Here we shall focus on the model of Barndorff-Nielsen 
and Shephard (2001a, b) which we now describe. 


1.1 BNS model and likelihood 

A quite attractive model was introduced by Barndorff-Nielsen and Shephard (2001a, b). 
Their proposed continuous time stochastic volatility (SV) model is based on the following 
differential equation, 

(2) $dx^*(t) = (\mu + \beta v(t))\,dt + v^{1/2}(t)\,dw(t)$ 

where $x^*(t)$ denotes the price level, and $v(t)$ is a stationary Ornstein-Uhlenbeck (OU) process 
which models the instantaneous volatility and is independent of $w(t)$. The induced likelihood 
model, which is based on the integrated volatility $\tau(t) = \int_0^t v(u)\,du$, can be described as 
follows. Let $X_i := x^*(i\Delta) - x^*((i-1)\Delta)$, for $i = 1,\ldots,n$, denote a sequence of the returns 
of the log price of a stock observed over intervals of length $\Delta > 0$. Additionally for each 
interval $[(i-1)\Delta, i\Delta]$, let $\tau_i = \tau(i\Delta) - \tau((i-1)\Delta)$. Now the model in (2) implies that 
$X_i \mid \tau_i,\beta,\mu$ are conditionally independent with 

(3) $X_i = \mu\Delta + \tau_i\beta + \tau_i^{1/2}\epsilon_i$ 


where $\epsilon_i$ are independent standard Normal random variables. Hence if $\tau$ depends on external 
parameters $\theta$, one is interested in estimating $(\mu, \beta, \theta)$ based on the likelihood 


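(4) $\mathscr{L}_{BNS}(\mathbf{X}\mid\mu,\beta,\theta) = \int_{\mathbb{R}_+^n}\left[\prod_{i=1}^{n}\phi(X_i\mid\mu\Delta+\beta\tau_i,\tau_i)\right] f(\tau_1,\ldots,\tau_n\mid\theta)\,d\tau_1\cdots d\tau_n$ 

where, setting $A_i = (X_i - \mu\Delta)$ and $\bar{A} = n^{-1}\sum_{i=1}^{n}A_i$, 

$\phi(X_i\mid\mu\Delta+\beta\tau_i,\tau_i) = \frac{e^{A_i\beta}}{\sqrt{2\pi\tau_i}}\,e^{-A_i^2/(2\tau_i)}\,e^{-\tau_i\beta^2/2}$ 

(so that the product over $i$ contributes the factor $e^{n\bar{A}\beta}$) 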
denotes a Normal density. We note that because of the complex dependence structure of the 
joint density $f(\tau_1,\ldots,\tau_n\mid\theta)$, the likelihood was thought to be intractable, thus inhibiting 
full likelihood based statistical inference for models involving quite arbitrary $\tau$. This is in 
contrast to the case of latent class GARCH models which are of considerable interest in 
financial economics [see for instance Fiorentini, Sentana, and Shephard (2004)]. However, 
in a closely related recent paper James (2005b) shows that the likelihood (4), where the $\tau_i$ 
are further generalized to be linear functionals of a Poisson random measure, is tractable 
and can be expressed exactly in terms of multi-dimensional Fourier-cosine transforms. The 
implication is that in general one can use classical numerical techniques to evaluate the 
likelihood. Moreover, these expressions are similar to quantities which regularly appear in 
the math finance literature on option pricing and related areas. 

In this paper we offer another approach that still allows us to work with integrated OU 
processes and indeed more general objects. Our purpose is twofold. First, we propose models 
which we believe are complementary to the above framework but exhibit quite different 
features, and in fact more flexibility in the sense of incorporating more general regression 
coefficients. Second, we believe that, because these two models exhibit different features, 
this invites individuals of quite varying backgrounds to conduct research on similar 
topics. Of course, exact analysis of these two classes of models also allows one to more easily 
critique, compare and improve such models. One could also consider a third class of models 
based on a hybridization of the two models. 

The difference is that our approach yields expressions of the likelihood in terms of 
random partitions of the integers which can be considered relatives of the Blackwell and 
MacQueen (1973) Polya urn distribution. Hence these models inherit many of the well-known 
features of Dirichlet/Gamma process mixture models and extensions addressed in 
James (2005a). Additionally, the posterior distribution of the random processes is also 
more in line with what happens for the case of Bayesian multiplicative intensity models, as it 
depends on the jumps of the underlying Poisson random measure. Moreover our models serve 
as an alternative to latent class GARCH models. In fact, one will see in the next section that 
our models are perhaps most closely related to the Normal inverse Gaussian/GARCH/A-PARCH 
models of Barndorff-Nielsen (1997) and Jensen and Lunde (2001), where we replace 
their GARCH/A-PARCH components by general $\tau$. 

2 A class of generalised hyperbolic integrated stochastic volatility models 

Before we present the model, we note that many authors have fitted semi-heavy tailed 
models in finance by specifying $\sigma^2$ in the Black-Scholes formula to follow a generalized inverse 
Gaussian (GIG) distribution. Hence the aggregate returns are from a generalised Hyperbolic 
(GH) distribution. We pause to describe this density which we shall use later. Let $\lambda$, 
$\nu$ and $\delta$ be such that $-\infty < \lambda < \infty$, while $\nu$ and $\delta$ are non-negative and not simultaneously 
0. As in Barndorff-Nielsen and Shephard (2001a), $T$ is a GIG$(\lambda, \delta, \nu)$ random variable if its 
density is of the form 

$f_{GIG}(t\mid\lambda,\delta,\nu) = \frac{(\nu/\delta)^{\lambda}}{2K_{\lambda}(\delta\nu)}\,t^{\lambda-1}\exp\left\{-\frac{1}{2}\left(\delta^{2}t^{-1}+\nu^{2}t\right)\right\},\quad t>0,$ 


where $K_\lambda$ is a Bessel function (the modified Bessel function of the third kind). When $\delta = 0$ and 
$\lambda > 0$, $\nu > 0$, GIG$(\lambda, 0, \nu)$ equates with the Gamma distribution. When $\lambda < 0$, $\delta > 0$ and 
$\nu = 0$, then GIG$(\lambda, \delta, 0)$ is a reciprocal, or inverse, Gamma distribution. Using the parametrization 
$\lambda = -a$, for $a > 0$, and $b = \delta^2/2$ yields the density of an inverse Gamma distribution with 
parameters $a, b$. A special case of this is when $\lambda = -1/2$, leading to a stable law of index 1/2. 
The inverse Gaussian distribution is defined by setting $\lambda = -1/2$, $\delta > 0$, and $\nu > 0$, that is, 
a GIG$(-1/2, \delta, \nu)$. The Hyperbolic distribution coincides with the case of $\lambda = 1$. See 
Prause (1999) and Eberlein (2001) for some additional background and references. The additional 
innovation in, for instance, Barndorff-Nielsen and Shephard (2001a, b) is that modelling volatility 
as a random process, $v(t)$, rather than a random variable, not only allows for semi-heavy-tailed 
models, but additionally induces serial dependence. 
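
As a numerical companion to the GIG density described above, the following is a minimal sketch, assuming the parametrization GIG(lambda, delta, nu) with density proportional to t^(lambda-1) exp(-(delta^2/t + nu^2 t)/2) and both delta and nu strictly positive. The function names `bessel_k`, `gig_normalizer` and `gig_pdf` are illustrative and not from the paper; the Bessel function is evaluated from its standard integral representation so that only the Python standard library is needed.

```python
import math


def bessel_k(order, x, upper=20.0, steps=4000):
    """K_order(x) from the integral int_0^inf exp(-x cosh t) cosh(order t) dt.

    Trapezoid rule on [0, upper]; the integrand decays doubly exponentially.
    """
    h = upper / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(order * t)
    return total * h


def gig_normalizer(lam, delta, nu):
    """Normalizing constant (nu/delta)^lam / (2 K_lam(delta nu)); needs delta, nu > 0."""
    return (nu / delta) ** lam / (2.0 * bessel_k(lam, delta * nu))


def gig_pdf(t, lam, delta, nu, norm=None):
    """GIG(lam, delta, nu) density at t > 0 (norm may be precomputed and reused)."""
    if norm is None:
        norm = gig_normalizer(lam, delta, nu)
    return norm * t ** (lam - 1.0) * math.exp(-0.5 * (delta ** 2 / t + nu ** 2 * t))
```

Passing a precomputed `norm` avoids re-evaluating the Bessel function when the density is needed on a grid, for example inside a Metropolis-Hastings step.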


2.1 The model and conditional likelihood 


We now describe a model which is a direct variant of (3) but is otherwise a subclass of 
a considerably more flexible but still tractable proposal which is given in section 2.3. Let 
$X_{s,t} := x^*(t) - x^*(s)$ denote the aggregate return of the log stock price over some interval $[s,t]$ 
for $0 < s < t$. Furthermore, define $\tau_{s,t} = \tau(t) - \tau(s)$ and $A_{s,t} = (x - \mu(t-s))$. Then given 
some filtration $\mathcal{F}_\tau(t)$ determined by $\tau$, and further depending on $\beta$ and $\mu$, we assume that 
$X_{s,t}$ is conditionally independent of the past with density 

(5) $f_{X_{s,t}}(x\mid\tau,\beta,\mu) = \frac{2\,\tau_{s,t}^{\lambda}\,e^{A_{s,t}\beta}}{\Gamma(\lambda)\sqrt{2\pi}}\left(\frac{\beta^{2}}{2\tau_{s,t}+A_{s,t}^{2}}\right)^{(\lambda+1/2)/2}K_{\lambda+1/2}\left(|\beta|\sqrt{2\tau_{s,t}+A_{s,t}^{2}}\right)$ 

for $\lambda > 0$. 

It follows that the density (5), for fixed $\tau$, represents a subclass of generalized Hyperbolic 
(GH) densities which reduces to the Student distribution when $\mu = 0$ and $\beta = 0$, 
but otherwise is one of the well-defined limiting cases of the GH model [see for instance 
Prause (1999)]. We point out further that although this density, for fixed $\tau$, does not contain 
the Normal inverse Gaussian or Hyperbolic distribution, Prause (1999, p. 7-11) gives examples 
where the densities in (5) provide a more plausible fit to the data than those models. 
However, due to the general distributional flexibility of $\tau$, these issues do not really concern 
us, and we shall further take $\lambda = 1$ for additional tractability. That is to say, estimation and 
model fitting will depend on parameters such as $\theta$, $\beta$ and $\mu$ and the distributional features 
of $\tau$. 

It would appear that the models in (5), and its corresponding likelihood model, are 
considerably more complex than (2), (3) and (4). However for each increment one may 
write 


(6) $X_{s,t} = \mu(t-s) + (\tau_{s,t}/Z)\beta + (\tau_{s,t}/Z)^{1/2}\epsilon_{s,t}$ 

where $Z$ is a Gamma random variable with shape $\lambda$, which we set to 1, and scale 1, and 
$\epsilon_{s,t}$ is an independent standard Normal random variable. Moreover, by utilizing a change of 
variable $W = Z/\tau_{s,t}$, the density (5) can be written as 

(7) $\int_0^\infty \phi(x\mid\mu(t-s)+\beta w^{-1},w^{-1})\,e^{-w\tau_{s,t}}\,\tau_{s,t}\,dw.$ 
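
The mixture representation (7) can be checked numerically. The sketch below is an illustration only, assuming lambda = 1 as in the text: it evaluates the integral over w by a midpoint rule and compares it, in the special case beta = 0 and mu = 0, with the Student-type form 1/(2 sqrt(2 tau)) (1 + x^2/(2 tau))^(-3/2) that follows from integrating the Normal density against tau e^(-w tau). The names `mixture_density` and `student_form`, and the argument `mu_dt` standing for mu(t-s), are hypothetical helpers.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a Normal(mean, var) random variable at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_density(x, tau, beta, mu_dt, upper=80.0, steps=100_000):
    """Midpoint-rule evaluation of (7) with lambda = 1.

    The grid is truncated at w = upper, so upper should be many multiples of 1/tau.
    """
    h = upper / steps
    total = 0.0
    for k in range(steps):
        w = (k + 0.5) * h
        total += normal_pdf(x, mu_dt + beta / w, 1.0 / w) * tau * math.exp(-w * tau)
    return total * h


def student_form(x, tau):
    """Closed form of (7) when beta = 0 and mu = 0 (lambda = 1)."""
    return (1.0 + x * x / (2.0 * tau)) ** -1.5 / (2.0 * math.sqrt(2.0 * tau))
```

For non-zero beta the same routine gives a direct numerical value of the density without evaluating a Bessel function, at the cost of a one-dimensional quadrature.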
2.2 The likelihood 


Let $Z_i$ be iid Gamma random variables with shape $\lambda = 1$ and scale 1. Then under the setting 
of (4) our model translates into the case where $X_i\mid\tau_i,\beta,\mu$ are conditionally independent and 
representable as 

(8) $X_i = \mu\Delta + (\tau_i/Z_i)\beta + (\tau_i/Z_i)^{1/2}\epsilon_i.$ 


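Now using a change of variable $W_i = Z_i/\tau_i$ results in a likelihood of $\mathbf{X}\mid\mu,\beta,\theta$ expressible 
as 

(9) $\mathscr{L}(\mathbf{X}\mid\mu,\beta,\theta) = \int_{\mathbb{R}_+^n}\left[\prod_{i=1}^{n}\phi(X_i\mid\mu\Delta+\beta w_i^{-1},w_i^{-1})\right]E\left[\prod_{i=1}^{n}\tau_i e^{-w_i\tau_i}\right]\prod_{i=1}^{n}dw_i$ 

where 

(10) $E\left[\prod_{i=1}^{n}\tau_i e^{-w_i\tau_i}\right] = E\left[e^{-\sum_{i=1}^{n}w_i\tau_i}\prod_{i=1}^{n}\tau_i\right].$ 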


It is noteworthy that (9) also has the form 

$C(\mathbf{X}\mid\mu,\beta)\int_{\mathbb{R}_+^n}E\left[\prod_{i=1}^{n}\tau_i e^{-w_i\tau_i}\right]\prod_{i=1}^{n}f_{GIG}(w_i\mid 3/2,|\beta|,|A_i|)\,dw_i$ 

where $C(\mathbf{X}\mid\mu,\beta)$ is determined from the Normal density and the GIG density. As we shall 
show, the expression in (10) is easily handled by applying the results of James (2005a, 2002). In 
closing this section, notice that once $\tau$ is integrated out in (9) one has a model whereby 
$X_i\mid w_i,\mu,\beta$ are independent, 

(11) $X_i = \mu\Delta + w_i^{-1}\beta + w_i^{-1/2}\epsilon_i.$ 

Hence conditional on $\mathbf{W} = (W_1,\ldots,W_n)$, the parameters $(\mu,\beta)$ are easily estimated by 
standard parametric methods. 
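
To make the last remark concrete: conditional on W, model (11) is a heteroscedastic linear regression of X_i on the regressors (Delta, 1/w_i) with error variance 1/w_i, so one possible "standard parametric method" is weighted least squares with weights w_i. The sketch below is an illustration under that reading, not the authors' procedure; `fit_mu_beta` is a hypothetical name, and the solution requires the w_i not to be all equal (otherwise mu and beta are not separately identified).

```python
def fit_mu_beta(x, w, delta):
    """Weighted least squares (conditional MLE) for (mu, beta) in model (11).

    Minimizes sum_i w_i * (x_i - mu*delta - beta/w_i)^2 given the latent w_i.
    """
    n = len(x)
    sum_w = sum(w)
    sum_inv_w = sum(1.0 / wi for wi in w)
    sum_wx = sum(wi * xi for wi, xi in zip(w, x))
    sum_x = sum(x)
    # Normal equations in m = mu*delta and beta:
    #   sum_wx = m * sum_w + beta * n
    #   sum_x  = m * n     + beta * sum_inv_w
    det = sum_w * sum_inv_w - n * n  # positive unless all w_i are equal
    if det <= 0.0:
        raise ValueError("w must contain at least two distinct values")
    m = (sum_wx * sum_inv_w - n * sum_x) / det
    beta = (sum_w * sum_x - n * sum_wx) / det
    return m / delta, beta
```

Within a Gibbs or SIS scheme this could serve as a starting value or as the center of a proposal for (mu, beta).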

Remark 1. Note that the likelihood models (4), analyzed in James (2005b), depended 
on $\tau$ only through terms such as $E[\prod_{i=1}^{n}e^{-w_i\tau_i}]$ rather than $E[\prod_{i=1}^{n}\tau_i e^{-w_i\tau_i}]$. In analogy to 
survival analysis, the first expression can be thought of as the likelihood of a model where one 
only observes right-censored observations, whereas the latter may represent the appearance 
of both complete and censored observations. This creates a fundamental difference in their 
respective marginal analysis and structure. 


2.3 Model flexibility 

One important advantage of the present approach over, say, the direct use of (4) is that we 
can more easily handle variations in the model. Briefly, we mention the following variation, 
which we believe helps address a question raised by M.C. Jones in the discussion section of 
Barndorff-Nielsen and Shephard (2001a, p. 225, 237), 


$X_i = \mu\Delta + (\tau_i/Z_i)^{\alpha}\beta + (\tau_i/Z_i)^{1/2}\epsilon_i.$ 
Owing to the same derivations as above, this leads to a model whereby the $X_i\mid w_i,\mu,\beta$ are 
independent 

$X_i = \mu\Delta + w_i^{-\alpha}\beta + w_i^{-1/2}\epsilon_i.$ 

More generally, for known real numbers $(\alpha_0, \alpha_1,\ldots,\alpha_k)$ and possibly unknown $(\beta_1,\ldots,\beta_k)$, 
our approach extends to models of the type 

$X_i = \mu\Delta + \sum_{j=1}^{k}(\tau_i/Z_i)^{\alpha_j}\beta_j + (\tau_i/Z_i)^{\alpha_0}\epsilon_i$ 

and beyond. This is seen by the fact that the transformation $W_i = Z_i/\tau_i$ yields the model whereby 
$X_i\mid w_i,\mu,\beta$ are independent, 

$X_i = \mu\Delta + \sum_{j=1}^{k}w_i^{-\alpha_j}\beta_j + w_i^{-\alpha_0}\epsilon_i.$ 

Note that the marginal distribution of $\mathbf{W}\mid\mu,\beta_1,\ldots,\beta_k$ is the same as for the case of (11). 
To be quite clear, all our forthcoming results hold for this more general setting by replacing 
$\phi(X_i\mid\mu\Delta+\beta w_i^{-1},w_i^{-1})$ with 

$\phi\left(X_i\,\Big|\,\mu\Delta+\sum_{j=1}^{k}\beta_j w_i^{-\alpha_j},\,w_i^{-2\alpha_0}\right).$ 

Obviously in this case the density (5) has to be replaced by a more general Normal-Gamma 
mixture, but is otherwise just as easy to implement. 

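Remark 2. It is not difficult to deal with models where, say, $\tau_i e^{-w_i\tau_i}$ is replaced by $\tau_i^{\alpha}e^{-w_i\tau_i}$ 
for $0 < \alpha < 1$. Based on our approach one simply writes $\tau_i^{\alpha} = \tau_i\,\tau_i^{\alpha-1}$ with 

$\tau_i^{\alpha-1} = \frac{1}{\Gamma(1-\alpha)}\int_0^\infty y^{-\alpha}e^{-y\tau_i}\,dy.$ 

An augmentation reveals that the likelihood would involve an additional n latent variables. 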


3 Evaluation of the likelihood for general r 

Similar to James (2005b) we now evaluate the likelihood in the case where the $\tau_i$ are more generally 
modeled as linear functionals of a Poisson random measure defined over Polish spaces. 
Let $N$ denote a Poisson random measure on some Polish space $\mathcal{V}$ with mean intensity, 

$E[N(dv)] = \nu(dv).$ 

We denote the Poisson law of $N$ with intensity $\nu$ as $P(dN\mid\nu)$. The Laplace functional for $N$ 
is defined as 

$E[e^{-N(f)}] = \int_{\mathbb{M}}e^{-N(f)}P(dN\mid\nu) = e^{-\Lambda(f)}$ 
where for any positive $f$, $N(f) = \int_{\mathcal{V}}f(x)N(dx)$ and $\Lambda(f) = \int_{\mathcal{V}}(1-e^{-f(x)})\nu(dx)$. $\mathbb{M}$ denotes 
the space of boundedly finite measures on $\mathcal{V}$ [see Daley and Vere-Jones (1988)]. We suppose 
that $\tau_i = N(f_i)$, for $i = 1,\ldots,n$, where $f_1,\ldots,f_n$ are positive measurable functions on $\mathcal{V}$. 
Notice now that the index $i = 1,\ldots,n$ need not correspond to fixed intervals involving 
$\Delta$. With this in mind, let $(w_1,\ldots,w_n)$ denote arbitrary non-negative numbers. We shall 
assume throughout that $f_1,\ldots,f_n$ are such that $\Lambda(\sum_{i=1}^{n}w_if_i) < \infty$. Notice first that one 
can write 

$\prod_{i=1}^{n}\tau_i = \prod_{i=1}^{n}\int_{\mathcal{V}}f_i(v_i)N(dv_i).$ 

Removing the integrals, one can treat the $\mathbf{V} = (V_1,\ldots,V_n)$ as missing values taking 
their values in $\mathcal{V}$. Similar to the case of the Blackwell-MacQueen distribution, which 
plays a fundamental role in Dirichlet and Gamma process mixture models [see Lo (1984) 
and Lo and Weng (1989), Ishwaran and James (2004) and James (2005a)], we can express 
the $\mathbf{V}$ as follows. Let $\mathbf{V}^* = (V_1^*,\ldots,V_{n(\mathbf{p})}^*)$ denote the $n(\mathbf{p}) \le n$ distinct values 
of $\mathbf{V}$, where $\mathbf{p} = \{C_1,\ldots,C_{n(\mathbf{p})}\}$ denotes a partition of the integers $\{1,2,\ldots,n\}$, with cells 
$C_j = \{i : V_i = V_j^*\}$ for $j = 1,\ldots,n(\mathbf{p})$. Additionally, let $e_{j,n}$, sometimes written as $e_j$, denote 
the size, or cardinality, of the cell $C_j$. Define $\Omega_n(v) = \sum_{i=1}^{n}w_if_i(v)$. Let $P(dN\mid\nu_{\Omega_n},\mathbf{V})$ 
correspond to the law of the random measure $N_{\Omega_n} + \sum_{j=1}^{n(\mathbf{p})}\delta_{V_j^*}$, where conditional on $(\mathbf{V},\mathbf{W})$, 
$N_{\Omega_n}$ is a Poisson random measure with mean intensity $\nu_{\Omega_n}(dv) := e^{-\Omega_n(v)}\nu(dv)$. Now an 
application of James (2005a, Proposition 2.3), augmenting (9), yields a joint distribution of 
$(\mathbf{V},\mathbf{W},N,\mathbf{X})$ which is expressible as, 

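(12) $P(dN\mid\nu_{\Omega_n},\mathbf{V})\left[e^{-\Lambda(\Omega_n)}\prod_{i=1}^{n}\phi(X_i\mid\mu\Delta+\beta W_i^{-1},W_i^{-1})\,f_i(V_i)\right]\prod_{j=1}^{n(\mathbf{p})}e^{-\Omega_n(V_j^*)}\nu(dV_j^*)$ 

where 

$e^{-\Lambda(\Omega_n)} = E\left[e^{-N(\sum_{i=1}^{n}w_if_i)}\right] = \int_{\mathbb{M}}e^{-N(\Omega_n)}P(dN\mid\nu).$ 

Note also that 

$\mathcal{M}(d\mathbf{V}\mid\nu_{\Omega_n}) = \nu_{\Omega_n}(dV_1)\prod_{i=2}^{n}\left[\nu_{\Omega_n}(dV_i)+\sum_{j=1}^{n(\mathbf{p}_{i-1})}\delta_{V_j^*}(dV_i)\right] = \prod_{j=1}^{n(\mathbf{p})}e^{-\Omega_n(V_j^*)}\nu(dV_j^*)$ 

corresponds to the n-th moment measure of a Poisson random measure with intensity $\nu_{\Omega_n}$, 
and importantly has a structure similar to the Blackwell-MacQueen distribution. The expression 
$\mathbf{p}_{i-1}$ corresponds to a partition of the integers $\{1,\ldots,i-1\}$ for $i \ge 2$. Now integrating 
out $(\mathbf{V},\mathbf{W},N)$ in (12) leads to an expression for the likelihood. 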

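Theorem 3.1 Suppose that $\tau_i = N(f_i)$ for $i = 1,\ldots,n$ where $N$ is a Poisson random 
measure on $\mathcal{V}$ with intensity $\nu$. Then the likelihood (9) can be expressed as 

$\mathscr{L}(\mathbf{X}\mid\mu,\beta,\theta) = \int_{\mathbb{R}_+^n}\vartheta(\mathbf{w})\,e^{-\Lambda(\Omega_n)}\prod_{i=1}^{n}\phi(X_i\mid\mu\Delta+\beta w_i^{-1},w_i^{-1})\,dw_i$ 

where $\vartheta(\mathbf{w}) = \sum_{\mathbf{p}}\prod_{j=1}^{n(\mathbf{p})}\int_{\mathcal{V}}\left[\prod_{i\in C_j}f_i(v)\right]\nu_{\Omega_n}(dv)$, and 
$\sum_{\mathbf{p}}$ denotes the sum over all partitions of the integers $\{1,\ldots,n\}$. □ 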
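Theorem 3.2 Suppose that $\tau_i = N(f_i)$ for $i = 1,\ldots,n$ where $N$ is a Poisson random 
measure on $\mathcal{V}$ with intensity $\nu$. Then augmenting the likelihood in Theorem 3.1 yields the 
joint posterior distribution of $\mathbf{V},\mathbf{W}\mid\mathbf{X}$ given by, 

$\pi_{\mathbf{X}}(d\mathbf{v},d\mathbf{w}) \propto e^{-\Lambda(\Omega_n)}\left[\prod_{i=1}^{n}\phi(X_i\mid\mu\Delta+\beta w_i^{-1},w_i^{-1})\,f_i(v_i)\,dw_i\right]\prod_{j=1}^{n(\mathbf{p})}e^{-\Omega_n(v_j^*)}\nu(dv_j^*).$ 

In particular we have the following posterior distributions 

(i) $\pi(d\mathbf{w}\mid\mathbf{v},\mathbf{X}) \propto e^{-\Lambda(\Omega_n)}\prod_{i=1}^{n}f_{GIG}\left(w_i\,\Big|\,3/2,|\beta|,\sqrt{A_i^2+2\sum_{j=1}^{n(\mathbf{p})}f_i(v_j^*)}\right)dw_i$ 

(ii) $\pi(d\mathbf{v}\mid\mathbf{w},\mathbf{X}) \propto \left[\prod_{i=1}^{n}f_i(v_i)\right]\mathcal{M}(d\mathbf{v}\mid\nu_{\Omega_n})$ 

(iii) $\pi_{\mathbf{X}}(d\mathbf{w}) = \vartheta(\mathbf{w})\,e^{-\Lambda(\Omega_n)}\prod_{i=1}^{n}\phi(X_i\mid\mu\Delta+\beta w_i^{-1},w_i^{-1})\,dw_i\,/\,\mathscr{L}(\mathbf{X}\mid\mu,\beta,\theta)$ is the posterior 
density of $\mathbf{W}\mid\mathbf{X}$. 

□ 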

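Proposition 3.1 The distribution of $\mathbf{V}\mid\mathbf{W},\mathbf{X}$ can be further described as follows. The 
posterior distribution of $\mathbf{V}\mid\mathbf{p},\mathbf{W},\mathbf{X}$ is such that the unique values $V_j^*$ are conditionally 
independent with distributions 

$P(V_j^*\in dv\mid\mathbf{p},\mathbf{w},\mathbf{X}) \propto \left[\prod_{i\in C_j}f_i(v)\right]\nu_{\Omega_n}(dv)$ for $j = 1,\ldots,n(\mathbf{p})$. 

The posterior distribution of $\mathbf{p}\mid\mathbf{W},\mathbf{X}$ is given by 

$\pi(\mathbf{p}\mid\mathbf{w},\mathbf{X}) \propto \prod_{j=1}^{n(\mathbf{p})}\int_{\mathcal{V}}\left[\prod_{i\in C_j}f_i(v)\right]\nu_{\Omega_n}(dv).$ 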

3.1 Posterior distribution of parameters 

Upon examining the likelihood, one sees that Bayesian inference for the parameters $(\mu, \beta, \theta)$ can 
be implemented in a straightforward manner, along the lines of methods outlined for the 
Dirichlet/Gamma process semi-parametric mixture models. See Ishwaran and James (2004) 
for these ideas and further pertinent references. Specifically, a straightforward application 
of Bayes rule yields the following results. 

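Proposition 3.2 Suppose that $\tau$ depends on a d-dimensional parameter $\theta$. Then if $q(d\theta)$, 
$q(d\beta)$, $q(d\mu)$ denote independent priors for $(\beta,\mu,\theta)$, their posterior distributions can be written 
as follows 

(i) $q(d\beta\mid\mu,\mathbf{w},\mathbf{X}) \propto q(d\beta)\,e^{-\left[\frac{\beta^2}{2}\sum_{i=1}^{n}w_i^{-1}-n\bar{A}\beta\right]}$ 

(ii) $q(d\mu\mid\beta,\mathbf{w},\mathbf{X}) \propto q(d\mu)\,e^{-\left[\frac{1}{2}\sum_{i=1}^{n}w_iA_i^2-n\bar{A}\beta\right]}$ 

(iii) $q(d\theta\mid\mathbf{w},\mathbf{V},\mathbf{X}) \propto q(d\theta)\,e^{-\Lambda_\theta(\Omega_n)}\prod_{j=1}^{n(\mathbf{p})}\left[\prod_{i\in C_j}f_i(V_j^*)\right]e^{-\Omega_n(V_j^*)}\nu_\theta(dV_j^*)$, 
where the subscript $\theta$ indicates dependence on $\theta$. 

□ 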
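
As an illustration of part (i), note that the kernel exp(-[(beta^2/2) sum_i 1/w_i - n*Abar*beta]) is Gaussian in beta. The sketch below works out the resulting update under the additional assumption, made here only for illustration, that the prior q(d beta) is Normal with mean `prior_mean` and variance `prior_var`; the paper leaves the priors general, and the function name `beta_posterior` is hypothetical.

```python
def beta_posterior(x, w, mu, delta, prior_mean=0.0, prior_var=1.0e6):
    """Posterior mean and variance of beta given (mu, w, X) under a Normal prior.

    Uses A_i = x_i - mu*delta and the likelihood kernel
    exp(-(beta^2/2) * sum(1/w_i) + beta * sum(A_i)).
    """
    a = [xi - mu * delta for xi in x]
    precision = sum(1.0 / wi for wi in w) + 1.0 / prior_var
    mean = (sum(a) + prior_mean / prior_var) / precision
    return mean, 1.0 / precision
```

A Gibbs step for beta would then draw from a Normal distribution with this mean and variance; with a non-Gaussian prior, a Metropolis-Hastings step targeting the kernel in (i) would be needed instead.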
3.2 Posterior distribution of the process 

The above results describe the behavior of the finite-dimensional likelihood and parameters. 
It is useful to also obtain a description of the underlying random process given the data. 
This allows one to see directly how the data affects the overall process. Moreover, combined 
with the results in James (2005a), it provides a calculus for more general functionals. 
For notational simplicity we suppose that $(\mu, \beta, \theta)$ are fixed. The next result also follows 
immediately from an application of Fubini's theorem and (12). 

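Theorem 3.3 Suppose that a likelihood of $\mathbf{X}$ and the specifications for $\tau$ and $N$ are defined 
by the specifications in Theorem 3.1. Let $\Omega_n(x) = \sum_{i=1}^{n}w_if_i(x)$. Let $\mathcal{W}_n = \mathcal{V}^n\times(0,\infty)^n$. Then 
the posterior distribution of $N\mid\mathbf{X}$ is 

$\int_{\mathcal{W}_n}P(dN\mid\nu_{\Omega_n},\mathbf{v})\,\pi_{\mathbf{X}}(d\mathbf{v},d\mathbf{w}).$ 

In particular for any positive or integrable function $g$ on $\mathbb{M}$, 

$\int_{\mathcal{W}_n}\left[\int_{\mathbb{M}}g(N)P(dN\mid\nu_{\Omega_n},\mathbf{v})\right]\pi_{\mathbf{X}}(d\mathbf{v},d\mathbf{w}) = \int_{\mathcal{W}_n}\left[\int_{\mathbb{M}}g\left(N+\sum_{j=1}^{n(\mathbf{p})}\delta_{v_j^*}\right)P(dN\mid\nu_{\Omega_n})\right]\pi_{\mathbf{X}}(d\mathbf{v},d\mathbf{w}).$ 

□ 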


3.3 A general posterior predictive density for the log price 

We now define a random variable similar to (6) which can be thought of as representing 
the log-price and give an explicit expression for its posterior density given $\mathbf{X}$. The random 
variable is defined as, 

$\tilde{X} = \mu\tilde{\Delta} + (\tilde{\tau}/Z)\beta + (\tilde{\tau}/Z)^{1/2}\epsilon,$ 

where $\tilde{\Delta}$ is just some non-negative number, $\tilde{\tau} = N(f)$ for some positive function $f$ such that 
the Laplace transform of $\tilde{\tau}$ exists, and $\epsilon$ is a standard Normal random variable. Now let $\Omega_{n+1}(x) = 
\Omega_n(x) + wf(x)$. 


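Proposition 3.3 The posterior density of the log stock price given $\mathbf{X}$, $f_{\tilde{X}}(x\mid\beta,\mu,\mathbf{X})$, is 
equal to 

$\int_{\mathcal{W}_n}\left[\int_0^\infty\phi(x\mid\mu\tilde{\Delta}+\beta w^{-1},w^{-1})\,q(w\mid\mathbf{v},\mathbf{w})\,e^{-[\Lambda(\Omega_{n+1})-\Lambda(\Omega_n)]}\,dw\right]\pi_{\mathbf{X}}(d\mathbf{v},d\mathbf{w}),$ 

where $q(w\mid\mathbf{v},\mathbf{w}) = e^{-w\sum_{j=1}^{n(\mathbf{p})}f(v_j^*)}\left[\int_{\mathcal{V}}f(v)e^{-wf(v)}\nu_{\Omega_n}(dv)+\sum_{j=1}^{n(\mathbf{p})}f(v_j^*)\right]$. 

Proof. The result follows from Theorem 3.3 and (7), using the following fact: 

$\int_{\mathbb{M}}N(f)e^{-wN(f)}P(dN\mid\nu_{\Omega_n},\mathbf{V}) = e^{-w\sum_{j=1}^{n(\mathbf{p})}f(V_j^*)}\int_{\mathbb{M}}\left[N(f)+\sum_{j=1}^{n(\mathbf{p})}f(V_j^*)\right]e^{-N(wf)}P(dN\mid\nu_{\Omega_n}),$ 

where $wN(f) = N(wf)$, and $\int_{\mathbb{M}}e^{-N(wf)}P(dN\mid\nu_{\Omega_n}) = e^{-[\Lambda(\Omega_{n+1})-\Lambda(\Omega_n)]}$. □ 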
3.4 Generalized Chinese Restaurant and Polya Urn procedures 

The results in the previous sections show that, viewing $\mathbf{W}$ as a parameter, these models are 
structurally similar to Bayesian semi-parametric mixture models based on multiplicative 
intensity likelihoods. As such, one can import computational sampling schemes described 
in Ishwaran and James (2004) and James (2005a) and references therein. One can deduce 
the necessary modifications for the general nonparametric setting in James (2005a) from 
the methods for the semi-parametric Gamma process setting described in Ishwaran and 
James (2004). In particular, this includes general semi-parametric analogues of Polya Urn 
Gibbs samplers and SIS procedures given by Escobar (1994), Liu (1996), and West, Muller 
and Escobar (1994), and the Gibbs sampling/SIS procedures based on a generalized weighted 
Chinese restaurant process [see Lo, Brunner and Chan (1996) and Ishwaran and James (2003)]. 
However, since these schemes are phrased primarily for completely random measures, which 
are a subclass of the models we look at here, we mention a few details for clarification in 
the general Poisson case. First note that sampling from the distribution of $\mathbf{W},\theta,\beta,\mu\mid\mathbf{V},\mathbf{X}$, 
described by Theorem 3.2 and Proposition 3.2, proceeds along the lines of well-known 
parametric procedures such as random walk Metropolis-Hastings. The task then remains 
to approximately sample $\mathbf{V}\mid\mathbf{W},\theta,\beta,\mu$. The key fact is that structurally these models are 
not markedly different from the Dirichlet/Gamma process mixture models based on the 
Blackwell-MacQueen Polya Urn distribution. In fact, a weighted Chinese restaurant SIS 
algorithm to sample Urn distributions derived from general Poisson random measures, such 
as that of $\mathbf{V}$, has already been given in James (2002, section 2.3). This of course translates 
into dual Gibbs sampling procedures. Here we sketch out the relevant probabilities to 
implement these types of schemes for simulation from the distribution 


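$\pi(d\mathbf{V}\mid\mathbf{W},\beta,\theta,\mu,\mathbf{X}) \propto \prod_{j=1}^{n(\mathbf{p})}\left[\prod_{i\in C_j}f_i(V_j^*)\right]\nu_{\Omega_n}(dV_j^*).$ 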




Note that this distribution will typically not depend on $(\beta, \mu)$ except through $\mathbf{W}$. Similar 
to James (2005a, equation 40), define for $r = 0,\ldots,n-1$ the conditional probabilities 

$P(V_{r+1}\in dx\mid\mathbf{V}_r) = \frac{l_{0,r}}{c_r}\lambda_r(dx)+\sum_{j=1}^{n(\mathbf{p}_r)}\frac{l_{j,r}(V_j^*)}{c_r}\delta_{V_j^*}(dx)$ 

where $\mathbf{V}_r = \{V_1,\ldots,V_r\}$, $\lambda_r(dx) \propto f_{r+1}(x)\nu_{\Omega_n}(dx)$, $l_{0,r} = \int_{\mathcal{V}}f_{r+1}(x)\nu_{\Omega_n}(dx)$ and 
$l_{j,r}(x) = f_{r+1}(x)$. Additionally $c_r = l_{0,r}+\sum_{j=1}^{n(\mathbf{p}_r)}l_{j,r}(V_j^*)$. Examining James (2005a, section 
4.4) we see these are the ingredients to implement general analogues of the Polya Urn Gibbs 
Sampler and SIS procedures described by Escobar (1994) and Liu (1996). An acceleration 
step similar to West, Muller and Escobar (1994) can be implemented by using the description 
in Ishwaran and James (2004, p. 180, Remark 2) combined with Proposition 3.1. Naturally, 
if one has a structure closer to completely random measures, the Polya Urn type methods 
described in James (2005a, section 4.4), which involve integrating out the jump components in 
the $\mathbf{V}$ vector, should be employed if possible. To get the Chinese restaurant type procedures 
one samples partitions $\mathbf{p}$ based on probabilities derived from $l_{0,r}$ and 

$l_{j,r} = \frac{\int_{\mathcal{V}}f_{r+1}(x)\left[\prod_{i\in C_{j,r}}f_i(x)\right]\nu_{\Omega_n}(dx)}{\int_{\mathcal{V}}\left[\prod_{i\in C_{j,r}}f_i(x)\right]\nu_{\Omega_n}(dx)}$ for $j = 1,\ldots,n(\mathbf{p}_r)$ 

where $\mathbf{p}_r$ denotes a partition of the integers $\{1,\ldots,r\}$ and each $C_{j,r} = \{i \le r : V_i = V_j^*\}$ 
denotes the corresponding cells. Additionally, $l(r) = l_{0,r}+\sum_{j=1}^{n(\mathbf{p}_r)}l_{j,r}$. See James (2002, 
section 2.3) for justification of these procedures. 
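
The sequential seating mechanism behind the weighted Chinese restaurant procedure can be sketched generically. In the code below, which is an illustration and not the algorithm of James (2002) verbatim, the callables `new_cell_weight(r)` and `join_weight(r, cell)` are hypothetical placeholders for the model-specific quantities l_{0,r} and l_{j,r}; computing those integrals is model dependent and is not attempted here. The routine returns the sampled partition together with the logarithm of the product of the normalizers l(r), which plays the role of the importance weight in SIS schemes of this type.

```python
import math
import random


def weighted_chinese_restaurant(n, new_cell_weight, join_weight, rng):
    """Seat indices 0..n-1 one at a time; return (cells, log importance weight).

    new_cell_weight(r): weight for opening a new cell when seating index r.
    join_weight(r, cell): weight for adding index r to an existing cell (a list).
    """
    cells = []
    log_weight = 0.0
    for r in range(n):
        weights = [new_cell_weight(r)] + [join_weight(r, cell) for cell in cells]
        total = sum(weights)
        log_weight += math.log(total)  # accumulates log l(r)
        u = rng.random() * total
        acc = 0.0
        choice = len(weights) - 1  # fallback guards against rounding at the boundary
        for k, wt in enumerate(weights):
            acc += wt
            if u < acc:
                choice = k
                break
        if choice == 0:
            cells.append([r])
        else:
            cells[choice - 1].append(r)
    return cells, log_weight
```

As a sanity check, taking `new_cell_weight` constant (equal to theta) and `join_weight` equal to the cell size reproduces the ordinary Chinese restaurant process of the Dirichlet process, for which the normalizer at step r is theta + r.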


Remark 3. The main distinction, structurally, between the nonparametric multiplicative 
intensity models described in James (2005a) and the present semi-parametric setting 
is the non-cancelation of the Laplace transform $e^{-\Lambda(\Omega_n)}$ in the likelihood, as it depends 
on external parameters $(\mathbf{W},\theta)$. This point is addressed in Ishwaran and James (2004) for 
Gamma processes. 


4 Results for completely random measures 


Many processes $\tau$ will be directly expressible as functionals of completely random measures, 
say $\mu$. This is the case for models based on the integrated OU processes of Barndorff-Nielsen 
and Shephard (2001a,b), where $\mu$ is the Background Driving Levy Process (BDLP). As such, 
refinements of the above results in that case can be deduced from James (2005a, section 
4). Note that a homogeneous completely random measure, say $\mu$, with no drift, has the 
representation $\mu(dy) = \int_0^\infty uN(du,dy)$, where $\mathcal{V} = (0,\infty)\times\mathcal{Y}$ for some Polish space $\mathcal{Y}$, and 
$\nu(du,dy) := \rho(du)\eta(dy)$. The measure $\rho$ is the Levy density of a non-negative infinitely 
divisible random variable, $T$, with Laplace transform 

$E[e^{-\omega T}] = e^{-\psi(\omega)}$ 

where $\psi(\omega) = \int_0^\infty(1-e^{-\omega u})\rho(du)$. One may then write $V_i = (J_i,Y_i)$ and $V_j^* = (J_{j,n},Y_j^*)$, 
where $(J_{j,n})$ denotes the unique jumps of the process $\mu$, picked by a type of biased sampling. 
Moreover it follows that for measurable functions $f_i(u,y) = ug_i(y)$, $\tau_i := N(f_i) = \mu(g_i)$. 
Additionally for any $f$ and $g$ such that $f(u,y) = ug(y)$, one has $\Lambda(f) = \int_{\mathcal{Y}}\psi(g(y))\eta(dy) < \infty$. 
Define $\rho_{\Omega_n}(du\mid y) = e^{-u\sum_{i=1}^{n}w_ig_i(y)}\rho(du)$, and for $l = 1,\ldots,n$ the conditional cumulants 

(13) $\kappa_l(\rho_{\Omega_n}\mid y) = \int_0^\infty u^l\rho_{\Omega_n}(du\mid y).$ 

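Theorem 4.1 Suppose that $N$ is a Poisson random measure with intensity $\nu(du,dy) = 
\rho(du)\eta(dy)$ on $\mathcal{V} = (0,\infty)\times\mathcal{Y}$. Suppose that $\tau_i := N(f_i) = \mu(g_i)$ for $i = 1,\ldots,n$. Then 
according to the model (9), one has the following results. 

(i) Setting $V_j^* = (J_{j,n},Y_j^*)$, for $j = 1,\ldots,n(\mathbf{p})$, conditional on $\mathbf{p},\mathbf{W},\mathbf{X}$, the pairs of 
random variables on $\mathcal{V}$ are independent with distributions 

$P(J_{j,n}\in du,Y_j^*\in dy\mid\mathbf{p},\mathbf{w},\mathbf{X}) \propto \frac{u^{e_j}\rho_{\Omega_n}(du\mid y)}{\kappa_{e_j}(\rho_{\Omega_n}\mid y)}\,\kappa_{e_j}(\rho_{\Omega_n}\mid y)\left[\prod_{i\in C_j}g_i(y)\right]\eta(dy),$ 

where $\int_{\mathcal{V}}\left[\prod_{i\in C_j}f_i(v)\right]\nu_{\Omega_n}(dv) = \int_{\mathcal{Y}}\kappa_{e_j}(\rho_{\Omega_n}\mid y)\prod_{i\in C_j}g_i(y)\,\eta(dy) := \vartheta(C_j\mid\mathbf{w})$ is the 
normalizing constant. 

(ii) The posterior distribution of $\mu\mid\mathbf{p},\mathbf{W},\mathbf{X}$ is such that $\mu(dx) = \mu_{\Omega_n}(dx)+\sum_{j=1}^{n(\mathbf{p})}J_{j,n}\delta_{Y_j^*}(dx)$, 
where given $\mathbf{p},\mathbf{W},\mathbf{X}$, $\mu_{\Omega_n}$ is a completely random measure determined by the law 
$P(dN\mid\nu_{\Omega_n})$, and the pairs $(J_{j,n},Y_j^*)$ are conditionally independent of $\mu_{\Omega_n}$ with distribution 
described in (i). 

(iii) The distribution of $\mathbf{p}\mid\mathbf{W},\mathbf{X}$ is proportional to $\prod_{j=1}^{n(\mathbf{p})}\vartheta(C_j\mid\mathbf{w})$. 

(iv) The density of $\mathbf{W}\mid\mathbf{p},\mathbf{X}$ is 

$f(\mathbf{w}\mid\mathbf{p},\mathbf{X}) \propto e^{-\int_{\mathcal{Y}}\psi(\sum_{i=1}^{n}w_ig_i(y))\eta(dy)}\left[\prod_{j=1}^{n(\mathbf{p})}\vartheta(C_j\mid\mathbf{w})\right]\prod_{i=1}^{n}\phi(X_i\mid\mu\Delta+\beta w_i^{-1},w_i^{-1}).$ 

All the above results hold, in an obvious way, for the inhomogeneous case of $\nu(du,dy) = 
\rho(du\mid y)\eta(dy)$. □ 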

4.1 Remarks on implementation 

The following remarks address specifically the models in this section, but clearly have extensions 
to the general setting. Suppose the infinitely divisible random variable $T$ has density 
$f_T$, and hence for a unique $\rho$, it has Laplace transform $\int_0^\infty e^{-\omega t}f_T(t)\,dt = e^{-\psi(\omega)}$. It is then 
important to note that the $\kappa_l(\rho_{\Omega_n}\mid y)$ are, for fixed $y$, the first $l = 1,\ldots,n$ cumulants of an 
infinitely divisible random variable with density, 

$f_{\Omega_n}(t\mid y) := e^{-t\sum_{i=1}^{n}w_ig_i(y)}\,f_T(t)\,e^{\psi(\sum_{i=1}^{n}w_ig_i(y))}.$ 

Now defining the first $l = 1,\ldots,n$ moments as $m_l(y) = \int_0^\infty t^lf_{\Omega_n}(t\mid y)\,dt$, it follows that 
the cumulants may be calculated using the result of Thiele. That is, 

$\kappa_l(\rho_{\Omega_n}\mid y) = m_l(y)-\sum_{k=1}^{l-1}\binom{l-1}{k-1}\kappa_k(\rho_{\Omega_n}\mid y)\,m_{l-k}(y).$ 

This indicates quite clearly the important fact that one need not have the specific form of 
the Levy density $\rho$ to implement estimation procedures for our models. An interesting case 
would be where $T$ has a log Normal distribution. Of course for models such as stable laws, 
where $\rho$ has a simple form and the probability density is generally complex, the converse is 
also true. 
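
The moment-to-cumulant recursion above is easy to implement. The following minimal sketch takes a list of raw moments m_1, ..., m_n (computed, for instance, by numerical integration against the tilted density) and returns the cumulants kappa_1, ..., kappa_n; the function name `cumulants_from_moments` is illustrative.

```python
from math import comb


def cumulants_from_moments(moments):
    """Return [kappa_1, ..., kappa_n] from raw moments [m_1, ..., m_n].

    Implements kappa_l = m_l - sum_{k=1}^{l-1} C(l-1, k-1) * kappa_k * m_{l-k}.
    """
    kappas = []
    for l in range(1, len(moments) + 1):
        correction = sum(
            comb(l - 1, k - 1) * kappas[k - 1] * moments[l - k - 1]
            for k in range(1, l)
        )
        kappas.append(moments[l - 1] - correction)
    return kappas
```

For a standard exponential variable the raw moments are l! and the cumulants are (l-1)!, which gives a quick check of the recursion.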


4.2 Example: Generalized Gamma processes 


We now provide some details for one of the most tractable classes of models. An interesting 
class of measures is the family of generalized Gamma random measures discussed in 
Brix (1999). Using the description of Brix (1999), these are $\mu$ processes with Levy measure 
indexed by $a$ and $b$. The values for $a$ and $b$ are restricted to satisfy $0 < a < 1$ and $0 \le b < \infty$, 
or $-\infty < a \le 0$ and $0 < b < \infty$. Different choices for $a$ and $b$ yield various subordinators. These include 
the stable subordinator when $b = 0$, the Gamma process subordinator when $a = 0$ and 
the inverse-Gaussian subordinator when $a = 1/2$ and $b > 0$. When $a < 0$ this results in 
a class of Gamma compound Poisson processes. It follows that, for $a \ne 0$ and $b > 0$, 
$\psi(\sum_{i=1}^{n}w_ig_i(y)) = \frac{1}{a}\left[(b+\sum_{i=1}^{n}w_ig_i(y))^{a}-b^{a}\right]$. The case of the Gamma process, $a = 0$, 
$b > 0$, is a limiting case and results in the well-known expression $\psi(\sum_{i=1}^{n}w_ig_i(y)) = \ln(1+ 
\sum_{i=1}^{n}w_ig_i(y)/b)$. Now, for all $a$ and $b$, conditional on $Y_j^*,\mathbf{p},\mathbf{W},\mathbf{X}$, each $J_{j,n}$ is Gamma 
distributed with shape and scale parameters $(e_j-a,\,b+\sum_{i=1}^{n}w_ig_i(Y_j^*))$. It follows that the 
joint moment measure of $\mathbf{Y}\mid\mathbf{W},\mathbf{X}$ is, 

$\prod_{j=1}^{n(\mathbf{p})}\frac{\Gamma(e_{j,n}-a)}{\Gamma(1-a)}\left(b+\sum_{i=1}^{n}w_ig_i(Y_j^*)\right)^{-(e_{j,n}-a)}\eta(dY_j^*)$ 

which is the key component in the sampling algorithms described in James (2005a). See 
Ishwaran and James (2004) for many details, including the usage of Blocked Gibbs Samplers, 
in the Gamma process semi-parametric setting which easily translates to the generalized 
Gamma class. 
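
The closed forms in this example can be collected in a few lines of code. The sketch below assumes the Levy density u^(-a-1) e^(-bu) / Gamma(1-a), which is the form consistent with the Laplace exponent and the Gamma-function ratios quoted above; the names `psi_gg` and `kappa_gg` are hypothetical, and `s` stands for the value of sum_i w_i g_i(y) at a fixed y.

```python
import math


def psi_gg(omega, a, b):
    """Laplace exponent of the generalized Gamma class (a < 1, b > 0)."""
    if a == 0.0:
        return math.log1p(omega / b)  # Gamma process limit
    return ((b + omega) ** a - b ** a) / a


def kappa_gg(l, s, a, b):
    """Conditional cumulant kappa_l = int u^l exp(-u s) rho(du) for l >= 1.

    Assumes rho(du) = u^(-a-1) exp(-b u) / Gamma(1 - a) du.
    """
    return math.gamma(l - a) / math.gamma(1.0 - a) * (b + s) ** (-(l - a))
```

Since kappa_1 is the derivative of the Laplace exponent evaluated at s, a finite-difference comparison of `psi_gg` with `kappa_gg(1, ...)` gives a simple consistency check.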

5 Example: BNS-OU model 

For proper comparison with the models (4) as derived in James (2005b), it is interesting to 
look at how the integrated OU model of Barndorff-Nielsen and Shephard (2001a,b, 2003) 
behaves in this scenario. We shall refer to this model as the BNS-OU model. Throughout 
this section we shall take $\mathcal{Y} = (-\infty,\infty)$. One may express the Barndorff-Nielsen and 
Shephard (2001a, b) integrated OU process $\tau$ as 

(14) $\tau(t) = \lambda^{-1}\left[(1-e^{-\lambda t})\int_{-\infty}^{0}e^{y}\mu(dy)+\int_{0}^{t}(1-e^{-\lambda(t-y)})\mu(dy)\right]$ 

where $v(0) := v_0 = \int_{-\infty}^{0}e^{y}\mu(dy)$ denotes the instantaneous volatility at time 0. The form 
in (14) is taken from Carr, Geman, Madan and Yor (2003, p. 365). It follows that for any 
$s < t$, $[\tau(t)-\tau(s)] = \mu(g_{s,t}) = N(f_{s,t})$ where $f_{s,t}(u,y) = ug_{s,t}(y)$ and $\lambda g_{s,t}(y)$ equals 

(15) $e^{-\lambda s}(1-e^{-\lambda(t-s)})e^{y}I_{\{y\le 0\}}+(1-e^{-\lambda(t-y)})I_{\{s<y\le t\}}+e^{-\lambda s}(1-e^{-\lambda(t-s)})e^{\lambda y}I_{\{0<y\le s\}}.$ 

The first component in (15) represents the contribution from $v_0$. Specializing this to 
$s = (i-1)\Delta$ and $t = i\Delta$ one has $\tau_i = \mu(g_i) = N(f_i)$ where $f_i(u,y) = ug_i(y)$ and further 
$g_i(y) = g_{i,1}(y)+g_{i,2}(y)$ with 

(16) $g_{i,1}(y) = \lambda^{-1}\left[(1-e^{-\lambda(i\Delta-y)})I_{\{(i-1)\Delta<y\le i\Delta\}}+e^{-\lambda(i-1)\Delta}(1-e^{-\lambda\Delta})e^{y}I_{\{y\le 0\}}\right]$ 

and 

(17) $g_{i,2}(y) = \lambda^{-1}e^{-\lambda(i-1)\Delta}(1-e^{-\lambda\Delta})e^{\lambda y}I_{\{0<y\le(i-1)\Delta\}}.$ 
Now for $i = 1, \ldots, n$, set $r_i = \lambda^{-1}(1 - e^{-\lambda\Delta})\sum_{j=i}^{n} w_j e^{-\lambda(j-1)\Delta}$ and define $r_{n+1} = 0$.

Now notice that for any sequence of numbers $w_1, \ldots, w_n$, the simplest expression will be obtained
by utilizing the following facts.

(18) $\sum_{j=1}^{n} w_j[g_{j,1}(y) + g_{j,2}(y)] = r_1 e^{\lambda y}$ for $y \le 0$

and for $i = 1, \ldots, n$

(19) $\sum_{j=1}^{n} w_j[g_{j,1}(y) + g_{j,2}(y)] = C(y|w_i, r_{i+1})$ for $(i-1)\Delta < y \le i\Delta$,

where for each $i$, $C(y|w_i, r_{i+1}) = [\lambda^{-1} w_i(1 - e^{-\lambda(i\Delta - y)}) + r_{i+1}e^{\lambda y}]$. Then one has the following
result which is a generalized version of James (2005b, Proposition 3.1).

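Proposition 5.1 For $0 \le s < t$, let $\tau(t) - \tau(s)$ be defined by (14) and (15). Then the results
of Theorem 4.1 hold with $f_i(u,y) = u[g_{i,1}(y) + g_{i,2}(y)]$, as described in (16) and (17). Suppose
that $\eta(dy) := \eta(y)dy$, and define $\eta_\lambda(i\Delta, u) = \eta(i\Delta + \lambda^{-1}\ln(1 - u))$. Now, in particular, using
a change of variable,

(i) $\exp\left\{-\int_{-\infty}^{\infty}\psi\left(\sum_{i=1}^{n} w_i g_i(y)\right)\eta(dy)\right\} = e^{-\Phi_0(r_1)}\, e^{-\Phi_n(w_n)} \prod_{i=1}^{n-1} e^{-\Phi_i(w_i|r_{i+1})}$

(ii) $\Phi_i(w_i|r_{i+1}) = \int_0^{1 - e^{-\lambda\Delta}} \lambda^{-1}\psi\left(r_{i+1}e^{\lambda i\Delta}(1 - u) + \lambda^{-1}w_i u\right)\eta_\lambda(i\Delta, u)\frac{du}{1-u}$ for $i = 1, \ldots, n-1$

(iii) $\Phi_n(w_n) = \int_0^{1 - e^{-\lambda\Delta}} \lambda^{-1}\psi\left(\lambda^{-1}w_n u\right)\eta_\lambda(n\Delta, u)\frac{du}{1-u}$

(iv) $\Phi_0(r_1) = \int_0^1 \psi(r_1 u)\,\eta(\lambda^{-1}\ln u)\frac{du}{\lambda u}$
□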
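
The identities (18) and (19) can be verified numerically. The sketch below reads $g_i = g_{i,1} + g_{i,2}$ from (16) and (17), and takes $r_i = \lambda^{-1}(1 - e^{-\lambda\Delta})\sum_{j \ge i} w_j e^{-\lambda(j-1)\Delta}$ with $r_{n+1} = 0$, which is the choice that makes (19) hold; the function names are hypothetical.

```python
import math

def g_i(i, y, lam, delta):
    """g_i(y) = g_{i,1}(y) + g_{i,2}(y), as read from (16) and (17)."""
    lo, hi = (i - 1) * delta, i * delta
    if y > hi:
        return 0.0
    if y > lo:
        return (1 - math.exp(-lam * (hi - y))) / lam
    # y <= (i-1)*delta covers both y <= 0 and 0 < y <= (i-1)*delta
    return (math.exp(-lam * lo) * (1 - math.exp(-lam * delta))
            * math.exp(lam * y)) / lam

def r(i, w, lam, delta):
    """Assumed definition of r_i; the empty sum gives r_{n+1} = 0."""
    n = len(w)
    return (1 - math.exp(-lam * delta)) / lam * sum(
        w[j - 1] * math.exp(-lam * (j - 1) * delta) for j in range(i, n + 1))

def C(y, i, w, lam, delta):
    """C(y | w_i, r_{i+1}) for (i-1)*delta < y <= i*delta."""
    return (w[i - 1] * (1 - math.exp(-lam * (i * delta - y))) / lam
            + r(i + 1, w, lam, delta) * math.exp(lam * y))

def weighted_sum(y, w, lam, delta):
    """Left-hand side of (18) and (19): sum_j w_j g_j(y)."""
    return sum(w[j - 1] * g_i(j, y, lam, delta) for j in range(1, len(w) + 1))
```

Evaluating `weighted_sum` at a point in each interval $((i-1)\Delta, i\Delta]$ and comparing with `C` reproduces (19); a negative `y` reproduces (18).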

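Additionally one has the following features which do not play a role in the analysis of
James (2005b). First, statements (18) and (19) imply that

$\rho_n(du|y) = \exp\left\{-u\left[r_1 e^{\lambda y} I_{\{y \le 0\}} + \sum_{i=1}^{n} C(y|w_i, r_{i+1}) I_{\{(i-1)\Delta < y \le i\Delta\}}\right]\right\}\rho(du).$

Now with some abuse of notation write

$\rho_{r_1}(du|y) = e^{-u r_1 e^{\lambda y}}\rho(du)$ and $\rho_{w_i, r_{i+1}}(du|y) = e^{-u C(y|w_i, r_{i+1})}\rho(du).$

Hence for $l = 1, \ldots, n$, one can write in an obvious way,

$\kappa_l(\rho_n|y) := \kappa_l(\rho_{r_1}|y) I_{\{y \le 0\}} + \sum_{i=1}^{n} \kappa_l(\rho_{w_i, r_{i+1}}|y) I_{\{(i-1)\Delta < y \le i\Delta\}}.$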

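Let $i_j^* = \min\{i : i \in C_j\}$, that is the minimal index in a cell $C_j$, then,

$\prod_{i \in C_j}(g_{i,1}(y) + g_{i,2}(y)) = a(j,\lambda)\left(\varsigma(y|i_j^*, e_j) I_{\{(i_j^*-1)\Delta < y \le i_j^*\Delta\}} + e^{\lambda e_j y} I_{\{y \le 0\}} + e^{\lambda e_j y} I_{\{0 < y \le (i_j^*-1)\Delta\}}\right)$

where

$a(j,\lambda) = \prod_{i \in C_j} \lambda^{-1}(1 - e^{-\lambda\Delta})e^{-\lambda(i-1)\Delta}$ and $\varsigma(y|i, e) = e^{\lambda(e-1)y}\, e^{\lambda(i-1)\Delta}\,\frac{1 - e^{-\lambda(i\Delta - y)}}{1 - e^{-\lambda\Delta}}.$

Now one has that $\kappa_{e_j}(\rho_n|y)\prod_{i \in C_j}(g_{i,1}(y) + g_{i,2}(y))/a(j,\lambda)$ is equal to

$r(y|i_j^*, e_j, \mathbf{w}) = e^{\lambda e_j y}\kappa_{e_j}(\rho_{r_1}|y) I_{\{y \le 0\}} + \varsigma(y|i_j^*, e_j)\,\kappa_{e_j}(\rho_{w_{i_j^*}, r_{i_j^*+1}}|y) I_{\{(i_j^*-1)\Delta < y \le i_j^*\Delta\}} + \sum_{k=1}^{i_j^*-1} e^{\lambda e_j y}\kappa_{e_j}(\rho_{w_k, r_{k+1}}|y) I_{\{(k-1)\Delta < y \le k\Delta\}}.$

One can check these points easily by looking at $[g_{i,1}(y) + g_{i,2}(y)][g_{l,1}(y) + g_{l,2}(y)]$ for any
pair $i < l$. Additionally,

$a_n := \prod_{j=1}^{n(\mathbf{p})} a(j,\lambda).$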
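
The factorization of the cell product can be checked directly. The sketch below is an illustration under assumptions: it takes $g_i$ as read from (16) and (17), sets $a(j,\lambda) = \prod_{i \in C_j}\lambda^{-1}(1 - e^{-\lambda\Delta})e^{-\lambda(i-1)\Delta}$, and uses the shape factor that this normalization forces on the interval containing the minimal index. All function names are hypothetical.

```python
import math

def g_i(i, y, lam, delta):
    """g_i(y) = g_{i,1}(y) + g_{i,2}(y), as read from (16) and (17)."""
    lo, hi = (i - 1) * delta, i * delta
    if y > hi:
        return 0.0
    if y > lo:
        return (1 - math.exp(-lam * (hi - y))) / lam
    return (math.exp(-lam * lo) * (1 - math.exp(-lam * delta))
            * math.exp(lam * y)) / lam

def a_cell(cell, lam, delta):
    """Assumed constant a(j, lam): the y-free part of the cell product."""
    return math.prod(
        math.exp(-lam * (i - 1) * delta) * (1 - math.exp(-lam * delta)) / lam
        for i in cell)

def cell_product_closed_form(y, cell, lam, delta):
    """a(j, lam) times the piecewise factor, driven by the minimal index."""
    i_star, e = min(cell), len(cell)
    lo, hi = (i_star - 1) * delta, i_star * delta
    if y > hi:
        return 0.0                      # g_{i*}(y) vanishes beyond i* * delta
    if y > lo:
        shape = (math.exp(lam * (e - 1) * y) * math.exp(lam * lo)
                 * (1 - math.exp(-lam * (hi - y)))
                 / (1 - math.exp(-lam * delta)))
    else:
        shape = math.exp(lam * e * y)   # same form for y <= 0 and 0 < y <= lo
    return a_cell(cell, lam, delta) * shape

def cell_product_direct(y, cell, lam, delta):
    """Brute-force product of g_i(y) over the cell."""
    return math.prod(g_i(i, y, lam, delta) for i in cell)
```

The closed form depends on the cell only through its minimal index, its size, and the constant `a_cell`, which is the observation that lets the partition structure collapse.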

The above derivations yield the necessary specifications for the characterization of the
pertinent features of this model via Theorem 4.1. However, our derivations above importantly
reveal a much more refined structure which has been associated with Bayesian mixture
models for monotone hazards and densities. See the works of Dykstra and Laud (1981),
Lo and Weng (1989), Brunner and Lo (1989), Ho (2002) and Ho (2005).

5.1 A combinatorial reduction in terms of s-paths 

Similar to Brunner and Lo (1989, p. 1553), let $\mathbf{m} = (m_1, \ldots, m_n)$ denote a vector of
non-negative integers taking values in $H := \{\mathbf{m} \mid \sum_{i=1}^{j} m_i \ge j,\ 1 \le j \le n-1,\ \sum_{i=1}^{n} m_i = n\}$.
Then each $m_i$ denotes the size of the cell whose minimal index is $i$. We also define binary
random variables $\{\xi_1, \ldots, \xi_n\}$, where $\xi_1 := 1$ and in general $\xi_i = 1$ if $i$ is the minimal index
of a cell, otherwise it is 0. Viewed in terms of sequentially sampling the random variables
$\mathbf{Y}$, $\xi_i = 1$ if $Y_i$ is distinct from the previous $Y_1, \ldots, Y_{i-1}$ random variables. This idea is
a generalization of the Bernoulli random variables associated with the Blackwell-MacQueen
distribution as discussed in Korwar and Hollander (1973). Hence, from the derivations in
the previous section, one has

$\prod_{j=1}^{n(\mathbf{p})} \varphi(C_j|\mathbf{w}) = a_n \prod_{j=1}^{n(\mathbf{p})} \int_{-\infty}^{\infty} r(y|i_j^*, e_j, \mathbf{w})\,\eta(dy) = a_n \prod_{i=1}^{n} \varphi(i, m_i|\mathbf{w})$

where $\varphi(i, m_i|\mathbf{w}) := \int_{-\infty}^{\infty} r(y|i, m_i, \mathbf{w})\,\eta(dy)$ if $m_i > 0$ and otherwise $\varphi(i, 0|\mathbf{w}) := 1$. Note
importantly that the size of the space $H$ is considerably smaller than the space of all partitions
$\mathbf{p}$ of the integers $\{1, \ldots, n\}$. Specifically a vector $\mathbf{m}$ contains information about the
number of unique values or non-empty cells $n(\mathbf{p}) := \sum_{i=1}^{n} \xi_i$, the size of each cell $m_i$, and the
minimal index of each cell. However one does not know precisely the indices $l \in C_j$ for each
$j = 1, \ldots, n(\mathbf{p})$. This leads to a more simplified version of the results in Theorem 4.1.
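
The relation between partitions and the vectors $\mathbf{m}$ can be made concrete with a small brute-force sketch. The constraint set used for $H$ is the one read from the definition above (partial sums $\sum_{i \le j} m_i \ge j$ for $j < n$, total equal to $n$); the function names are hypothetical and the enumeration is only practical for small $n$.

```python
from itertools import product

def in_H(m):
    """Check the s-path constraints: partial sums >= j for j < n, total == n."""
    n = len(m)
    partial = 0
    for j, m_j in enumerate(m, start=1):
        partial += m_j
        if j < n and partial < j:
            return False
    return partial == n

def enumerate_H(n):
    """List all vectors in H by filtering {0, ..., n}^n (small n only)."""
    return [m for m in product(range(n + 1), repeat=n) if in_H(m)]

def partition_to_m(partition, n):
    """Map a partition of {1, ..., n} to m: m_i is the size of the cell
    whose minimal index is i, and 0 if no cell has minimal index i."""
    m = [0] * n
    for cell in partition:
        m[min(cell) - 1] = len(cell)
    return tuple(m)
```

For example, the partitions `[{1, 3}, {2, 4}]` and `[{1, 4}, {2, 3}]` of `{1, 2, 3, 4}` both map to `(2, 2, 0, 0)`, which illustrates how several partitions share one vector and why the exact cell memberships are not recoverable from it.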
j=i j=i 
Proposition 5.2 Suppose that $N$ is a Poisson random measure with intensity $\nu(du, dy) =
\rho(du)\eta(dy)$ on $\mathcal{X} = (0, \infty) \times \mathcal{Y}$. Suppose that $\tau$ is defined by (14). Then the likelihood (9)
can be expressed as

$\mathcal{L}(\mathbf{X}|\mu, \delta, \theta) = a_n \int_{\mathbb{R}_+^n}\left[\sum_{\mathbf{m} \in H}\prod_{i=1}^{n}\varphi(i, m_i|\mathbf{w})\right] e^{-\Phi_0(r_1)}\prod_{i=1}^{n} e^{-\Phi_i(w_i|r_{i+1})}\,\phi(X_i|\mu\Delta + \delta w_i^{-1}, w_i^{-1})\,d\mathbf{w}$

where $\Phi_j$ for $j = 0, \ldots, n$ are given in Proposition 5.1. □


Proposition 5.3 Suppose that $N$ is a Poisson random measure with intensity $\nu(du, dy) =
\rho(du)\eta(dy)$ on $\mathcal{X} = (0, \infty) \times \mathcal{Y}$. Suppose that $\tau$ is defined by (14). Let $\mathbf{M} = (M_1, \ldots, M_n)$
denote the random vector corresponding to the observations $\mathbf{m}$. Then, one has the following
results.


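(i) The posterior distribution of $(\mathbf{J}, \mathbf{Y})|\mathbf{M}, \mathbf{W}, \mathbf{X}$ consists of $n(\mathbf{p}) = \sum_{i=1}^{n}\xi_i$ unique values
$\{(J_i, Y_i) : \xi_i = 1,\ 1 \le i \le n\}$ which are conditionally independent with respective
distributions,

$P(J_i \in du, Y_i \in dy|\mathbf{m}, \mathbf{w}, \mathbf{X}) = \frac{u^{m_i}\rho_n(du|y)}{\kappa_{m_i}(\rho_n|y)} \cdot \frac{r(y|i, m_i, \mathbf{w})\,\eta(dy)}{\varphi(i, m_i|\mathbf{w})}.$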


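(ii) The distribution of $\mathbf{M}|\mathbf{W}, \mathbf{X}$ is given by

$P(M_1 = m_1, \ldots, M_n = m_n|\mathbf{w}, \mathbf{X}) \propto \prod_{i=1}^{n}\varphi(i, m_i|\mathbf{w})$ for $\mathbf{m} \in H$.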


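(iii) The density of $\mathbf{W}|\mathbf{M}, \mathbf{X}$ is proportional to

$e^{-\Phi_0(r_1)}\prod_{i=1}^{n}\varphi(i, m_i|\mathbf{w})\, e^{-\Phi_i(w_i|r_{i+1})}\,\phi(X_i|\mu\Delta + \delta w_i^{-1}, w_i^{-1}).$

□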


Proof. The proof is straightforward. It essentially follows from a relabeling of the
components in Theorem 4.1, combined with the form of the likelihood in Proposition 5.2. □
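
For small $n$, a distribution on $H$ with probabilities proportional to a product of per-index weights can be sampled exactly by enumeration. The sketch below is only an illustration of that product form: `phi` is a hypothetical user-supplied callable standing in for the cell weights (returning 1 when $m_i = 0$), and nothing here evaluates the actual integrals of the model.

```python
import math
import random
from itertools import product

def enumerate_H(n):
    """All m with partial sums >= j for j < n and total n (small n only)."""
    out = []
    for m in product(range(n + 1), repeat=n):
        partial, ok = 0, True
        for j, m_j in enumerate(m, start=1):
            partial += m_j
            if j < n and partial < j:
                ok = False
                break
        if ok and partial == n:
            out.append(m)
    return out

def sample_m(phi, n, rng=random):
    """Draw m from H with probability proportional to prod_i phi(i, m_i).

    phi(i, m_i) is a hypothetical non-negative weight function; by
    convention it should return 1 when m_i == 0.
    """
    support = enumerate_H(n)
    weights = [math.prod(phi(i, m_i) for i, m_i in enumerate(m, start=1))
               for m in support]
    return rng.choices(support, weights=weights, k=1)[0]
```

Enumeration costs grow quickly with $n$, so this is a correctness reference rather than a practical sampler; efficient schemes such as those of Ho (2005) would replace it in applications.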

We next describe the posterior distribution of the integrated OU process. 

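Proposition 5.4 Suppose that $\tau$ is defined by (14); then the posterior distribution of
$\tau|\mathbf{M}, \mathbf{W}, \mathbf{X}$ is equivalent to the conditional distribution of the random measure,

$\tau_n(t) = \lambda^{-1}\left[(1 - e^{-\lambda t})v_{0,n} + \int_0^t\left(1 - e^{-\lambda(t-y)}\right)\mu_n(dy) + \sum_{i=1}^{n}\xi_i J_i\left(1 - e^{-\lambda(t - Y_i)}\right) I_{\{0 < Y_i \le t\}}\right]$

where $v_{0,n} := \int_{-\infty}^{0} e^{\lambda y}\mu_n(dy) + \sum_{i=1}^{n}\xi_i J_i e^{\lambda Y_i} I_{\{Y_i \le 0\}}$ has the posterior distribution of $v_0$. □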
Proof. This result follows from Theorem 4.1 using the fact that = 

(2005) for a similar argument.□ 

The structure $\mathbf{m} \in H$ is in one-to-one correspondence with what are called s-paths by
Brunner and Lo (1989). See that work, in particular Brunner and Lo (1989, Lemma 2.1,
Theorem 2.1) for slightly different representations. We close by noting that we are rather
surprised that the BNS-OU model used in our likelihood structure generates s-paths. As
mentioned earlier, s-paths are known to be generated by representing monotone hazard rates
as $\int I_{\{t < u\}}\mu(du)$ where $\mu$ is a general completely random measure. This is the formulation
recently investigated by Ho (2005) where the Gamma process results of Dykstra and
Laud (1981), and Lo and Weng (1989) are special cases. Naturally, these are closely
connected to the symmetric unimodal density Dirichlet mixture models considered by Brunner
and Lo (1989). The BNS-OU models represent the first non-trivial mixture models
outside the above mentioned class where inference can be based on sampling solely s-paths
or equivalently $\mathbf{m}$, rather than $\mathbf{p}$. This is an important fact: while sampling
partitions $\mathbf{p}$ is not difficult, these models are significantly less complex than models which
can only be minimally expressed in terms of partitions. That is, the space of partitions of the
integers $\{1, \ldots, n\}$ is known to contain Bell's number of terms, which is bounded above by $n!$,
grows super-exponentially in $n$, and is considerably larger than the size of $H$. Ho (2005) has
devised efficient computational procedures for sampling s-path models which can be easily
imported to the present setting. See also Ho (2002).
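
The difference in size between the two spaces can be illustrated by counting. The sketch below counts the vectors in $H$ by dynamic programming over partial sums and computes Bell numbers with the Bell triangle. The comparison with Catalan numbers relies on the standard fact that sequences with these partial-sum constraints are counted by them; that fact is brought in here for illustration and is not stated in the text. Function names are hypothetical.

```python
from math import comb

def count_H(n):
    """Count vectors m in H by dynamic programming over partial sums."""
    ways = {0: 1}                       # partial sum -> number of prefixes
    for j in range(1, n + 1):
        nxt = {}
        for total, c in ways.items():
            for m_j in range(n - total + 1):
                new = total + m_j
                if new >= j:            # constraint sum_{i<=j} m_i >= j
                    nxt[new] = nxt.get(new, 0) + c
        ways = nxt
    return ways.get(n, 0)

def bell(n):
    """Bell number B_n (number of partitions of {1..n}) via the Bell triangle."""
    row = [1]
    for _ in range(n):
        new = [row[-1]]
        for x in row:
            new.append(new[-1] + x)
        row = new
    return row[0]

def catalan(n):
    """Catalan number C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)
```

For $n = 10$ the count of vectors in $H$ is 16796 while the number of partitions is 115975, and the gap widens rapidly as $n$ grows.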